In the past few years, neural architecture search (NAS) has become an increasingly important tool within the deep learning community. Despite the many recent successes of NAS, however, most existing approaches operate within highly structured design spaces, and hence explore only a small fraction of the full search space of neural architectures while also requiring significant manual effort from domain experts. In this work, we develop techniques that enable efficient NAS in a significantly larger design space. To accomplish this, we propose to perform NAS in an abstract search space of program properties. Our key insights are as follows: (1) the abstract search space is significantly smaller than the original search space, and (2) architectures with similar program properties also have similar performance; thus, we can search more efficiently in the abstract search space. To enable this approach, we also propose a novel efficient synthesis procedure, which accepts a set of promising program properties, and returns a satisfying neural architecture. We implement our approach, $\alpha$NAS, within an evolutionary framework, where the mutations are guided by the program properties. Starting with a ResNet-34 model, $\alpha$NAS produces a model with slightly improved accuracy on CIFAR-10 but 96% fewer parameters. On ImageNet, $\alpha$NAS is able to improve over Vision Transformer (30% fewer FLOPS and parameters), ResNet-50 (23% fewer FLOPS, 14% fewer parameters), and EfficientNet (7% fewer FLOPS and parameters) without any degradation in accuracy.
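The abstract describes searching in a small space of program properties and accepting mutations guided by those properties. The toy sketch below illustrates that general idea only: candidates (here, lists of layer widths) are abstracted to a coarse property vector, and a greedy evolutionary loop accepts a mutation when it improves a cheap objective computed on the properties. The property vector, fitness function, and mutation operator are illustrative assumptions, not the paper's actual abstraction or synthesis procedure.

```python
import random

def properties(arch):
    """Abstract an 'architecture' (list of layer widths) to coarse properties.
    Illustrative stand-in for the paper's program properties."""
    return (len(arch), sum(arch))  # depth, total-parameter proxy

def fitness(props):
    """Toy objective: prefer fewer parameters at a fixed target depth."""
    depth, size = props
    return -abs(depth - 4) * 100 - size

def mutate(arch, rng):
    """Perturb one layer width; widths stay positive."""
    arch = list(arch)
    i = rng.randrange(len(arch))
    arch[i] = max(1, arch[i] + rng.choice([-8, 8]))
    return arch

def evolve(seed_arch, steps=200, seed=0):
    """Greedy (1+1) evolution, comparing candidates in property space only."""
    rng = random.Random(seed)
    best = seed_arch
    for _ in range(steps):
        cand = mutate(best, rng)
        if fitness(properties(cand)) > fitness(properties(best)):
            best = cand  # accept only property-space improvements
    return best

print(evolve([64, 64, 64, 64]))
```

The key point the sketch captures is that the expensive object (the concrete architecture) is never evaluated directly during search; only its cheap abstraction is compared.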
Multi-chip modules (MCMs) reduce the design and fabrication cost of machine learning (ML) accelerators while delivering performance and energy efficiency on par with a monolithic large chip. However, ML compilers targeting MCMs must solve complex optimization problems optimally and efficiently to achieve this high performance. One such problem is the multi-chip partitioning problem, in which the compiler determines the optimal assignment and placement of operations in a tensor computation graph onto the chiplets of an MCM. Partitioning ML graphs for multi-chip modules is particularly hard, as the search space grows exponentially with the number of available chiplets and the number of nodes in the neural network. Furthermore, the constraints imposed by the underlying hardware produce a search space in which valid solutions are extremely sparse. In this paper, we present a strategy that uses a deep reinforcement learning (RL) framework to emit possibly invalid candidate partitions, which are then corrected by a constraint solver. Using the constraint solver ensures that RL encounters valid solutions in the sparse space often enough to converge with fewer samples than non-learned strategies. The architectural choices we make for the policy network allow us to generalize across different ML graphs. Our evaluation of a production-scale model, BERT, on real hardware shows that the partitions generated using the RL policy achieve 6.11% and 5.85% higher throughput than random search and simulated annealing, respectively. Moreover, fine-tuning a pre-trained RL policy reduces the search time from 3 hours to only 9 minutes, while achieving the same throughput as training the RL policy from scratch.
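The emit-then-repair idea described above can be sketched in miniature: a policy (here a random stand-in for the learned RL policy) proposes a node-to-chiplet assignment that may violate a per-chiplet capacity constraint, and a simple greedy routine repairs it into a valid assignment. The capacity model and repair rule are illustrative assumptions; the paper's actual constraint solver and hardware constraints are more involved.

```python
import random

def propose(num_nodes, num_chiplets, rng):
    """Stand-in for the RL policy: emits a possibly invalid assignment."""
    return [rng.randrange(num_chiplets) for _ in range(num_nodes)]

def repair(assignment, num_chiplets, capacity):
    """Greedy 'solver': move nodes off over-full chiplets to the least-loaded one."""
    loads = [0] * num_chiplets
    for c in assignment:
        loads[c] += 1
    fixed = list(assignment)
    for i, c in enumerate(fixed):
        if loads[c] > capacity:
            target = min(range(num_chiplets), key=lambda k: loads[k])
            loads[c] -= 1
            loads[target] += 1
            fixed[i] = target
    return fixed

rng = random.Random(0)
cand = propose(num_nodes=12, num_chiplets=4, rng=rng)
valid = repair(cand, num_chiplets=4, capacity=3)
print(cand, "->", valid)
```

Because the repair step always returns a feasible assignment, every rollout yields a valid sample for the policy to learn from, which is the mechanism the abstract credits for sample-efficient convergence in a sparse-feasibility search space.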